77 research outputs found

    Peaks detection and alignment for mass spectrometry data

    The goal of this paper is to review existing methods for protein mass spectrometry data analysis and to present a new methodology for automatic extraction of significant peaks (biomarkers). For the pre-processing step required for data from MALDI-TOF or SELDI-TOF spectra, we use a purely nonparametric approach that combines a stationary invariant wavelet transform for noise removal with penalized spline quantile regression for baseline correction. We further present a multi-scale spectra alignment technique based on the identification of statistically significant peaks from a set of spectra. This method allows one to find common peaks in a set of spectra that can subsequently be mapped to individual proteins. These peaks may serve as useful biomarkers in medical applications or as individual features for further multidimensional statistical analysis. MALDI-TOF spectra obtained from serum samples are used throughout the paper to illustrate the methodology.

    On Simulation of Manifold Indexed Fractional Gaussian Fields

    Simulating fractional Brownian motion indexed by a manifold poses serious numerical problems: storage, computing time and the choice of an appropriate grid. We propose an effective and fast method, valid not only for fractional Brownian fields indexed by a manifold but for any Gaussian field indexed by a manifold. The performance of our method is illustrated with different manifolds (sphere, hyperboloid).

    Pairwise likelihood estimation for multivariate mixed Poisson models generated by Gamma intensities

    Estimating the parameters of multivariate mixed Poisson models is an important problem in image processing applications, especially for active imaging or astronomy. The classical maximum likelihood approach cannot be used for these models since the corresponding masses cannot be expressed in a simple closed form. This paper studies a maximum pairwise likelihood approach to estimate the parameters of multivariate mixed Poisson models when the mixing distribution is a multivariate Gamma distribution. The consistency and asymptotic normality of this estimator are derived. Simulations conducted on synthetic data illustrate these results and show that the proposed estimator outperforms classical estimators based on the method of moments. An application to change detection in low-flux images is also investigated.

    On Fractional Gaussian Random Fields Simulations

    Simulating Gaussian fields poses serious numerical problems: storage and computing time. The midpoint displacement method is often used for simulating fractional Brownian fields because it is fast. We propose an effective and fast method, valid not only for fractional Brownian fields but for any Gaussian field. First, our method is compared with the midpoint displacement method for fractional Brownian fields. Second, the performance of our method is illustrated by simulating several Gaussian fields. FieldSim is an R package, developed in R and C, that implements the procedures on which this paper focuses.

    Nonparametric Pre-Processing Methods and Inference Tools for Analyzing Time-of-Flight Mass Spectrometry Data

    The objective of this paper is to contribute to the methodology available for extracting and analyzing signal content from protein mass spectrometry data. Data from MALDI-TOF or SELDI-TOF spectra require considerable signal pre-processing, such as noise removal and baseline level error correction. After removing the noise with an invariant wavelet transform, we develop a background correction method based on penalized spline quantile regression and apply it to MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) spectra obtained from serum samples. The results show that the wavelet transform technique combined with nonparametric quantile regression can handle all kinds of background and spectra with a low signal-to-background ratio; it requires no prior knowledge about the spectra composition, no selection of suitable background correction points, and no mathematical assumption about the background distribution. We further present a novel multi-scale spectra alignment methodology that is useful in a functional analysis of variance method for identifying proteins that are differentially expressed between different tissue types. Our approaches are compared with several existing approaches in the recent literature and are tested on simulated and real data. The results indicate that the proposed schemes enable accurate diagnosis, with high sensitivity, based on the over-expression of a small number of identified proteins.

    Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data

    Prediction from high-dimensional genomic data is an active field in today's medical research. Most of the proposed prediction methods use genomic data alone, without considering established clinical data that are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions. In this paper we consider classification methods that use both types of variables simultaneously but apply dimension reduction only to the high-dimensional genomic ones. A usual way to deal with this is a two-step approach. In step one, a dimensionality reduction technique is performed on the genomic dataset alone. In step two, the selected genomic variables are merged with the clinical variables to build a classification model on the combined dataset. However, the dimension reduction is then built without taking into account the link between the response variable and the clinical data. To address this issue, using Partial Least Squares (PLS) as the reduction technique, we propose a one-step approach based on three extensions of the LS-PLS method (LS for Least Squares) to the logistic regression context. We perform a simulation study to evaluate these approaches against methods using only the clinical data or only the genomic data. We then illustrate their performance by classifying two real data sets containing both clinical information and gene expression.

    Confidence Intervals for Adaptive Regression Estimation on the Besov Spaces

    The problem of adaptive estimation of the regression function f from noisy observations is considered. A confidence interval for the L_2-error of a wavelet adaptive estimator is provided. We show that if f belongs to a Besov class, the proposed confidence interval is minimax.